Signed-off-by: Ceng23333 <441651826@qq.com>
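Unique token tracking (`scripts/infer_task.py`)
- `_unique_generated_tokens` set to track unique token IDs
- Incremental update when `next()` generates a new token
- `get_unique_previous_tokens()` returns a sorted array of the unique tokens

Batching layer (`scripts/jiuge.py`)
- `JiugeBatchedTask` collects the unique tokens from all tasks

C++ interface updates
- `inferBatchJiuge()` and `inferBatch()` now accept `previous_tokens_per_req` and `previous_tokens_len_per_req`
- `InferRequest` struct now includes the unique-token fields
- `inferDeviceBatch()` and `inferDeviceBatchPaged()` now pass the unique tokens along
- `InferenceContext::randomSample()` now accepts and forwards the unique tokens

Python bindings (`scripts/libinfinicore_infer/jiuge.py`)
- `inferBatchJiuge` argument types now include the unique-token arrays
- `infer_batch()` method signature

API server (`scripts/launch_server.py`)
- `--port` and `--host` arguments for server configuration
- `/models` endpoint
- `chat_template_kwargs` passthrough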
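
The unique-token bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `InferTaskSketch` and `collect_unique_previous_tokens` are hypothetical names, and returning one Python list per request plus a list of lengths is an assumption about how `previous_tokens_per_req` and `previous_tokens_len_per_req` are populated. Only `_unique_generated_tokens`, `next()` and `get_unique_previous_tokens()` are names taken from the description.

```python
class InferTaskSketch:
    """Hypothetical stand-in for the task class in scripts/infer_task.py."""

    def __init__(self):
        # Set of token IDs generated so far; duplicates collapse automatically.
        self._unique_generated_tokens = set()

    def next(self, out_token):
        # Incremental update: adding one token to a set is O(1) on average,
        # so history is never rescanned when a new token is generated.
        self._unique_generated_tokens.add(out_token)

    def get_unique_previous_tokens(self):
        # Sorted so that the output is deterministic across runs.
        return sorted(self._unique_generated_tokens)


def collect_unique_previous_tokens(tasks):
    """Gather per-request unique tokens and their lengths (assumed layout)."""
    previous_tokens_per_req = [t.get_unique_previous_tokens() for t in tasks]
    previous_tokens_len_per_req = [len(toks) for toks in previous_tokens_per_req]
    return previous_tokens_per_req, previous_tokens_len_per_req


# Usage: two requests, one with a repeated token and one with no output yet.
task_a, task_b = InferTaskSketch(), InferTaskSketch()
for tok in (5, 3, 5):
    task_a.next(tok)

tokens, lengths = collect_unique_previous_tokens([task_a, task_b])
print(tokens)   # [[3, 5], []]
print(lengths)  # [2, 0]
```

Keeping a set per task means each request carries only its distinct token IDs to the sampler rather than its full generation history. How the lists are marshalled into the C++ call is not shown here.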